<!DOCTYPE html><html lang="en" data-theme="light"><head><meta charset="UTF-8"><meta http-equiv="X-UA-Compatible" content="IE=edge"><meta name="viewport" content="width=device-width,initial-scale=1"><title>CVTE Internship | Interview Blog</title><meta name="author" content="Luo Jiehao"><meta name="copyright" content="Luo Jiehao"><meta name="format-detection" content="telephone=no"><meta name="theme-color" content="#ffffff"><meta name="description" content="Internship experience in the Document Image Analysis and Recognition group, CVTE Central Research Institute">
<meta property="og:type" content="article">
<meta property="og:title" content="CVTE Internship">
<meta property="og:url" content="https://luo_13.gitee.io/interview/2021/03/07/CVTE%E5%AE%9E%E4%B9%A0/index.html">
<meta property="og:site_name" content="Interview Blog">
<meta property="og:description" content="Internship experience in the Document Image Analysis and Recognition group, CVTE Central Research Institute">
<meta property="og:locale" content="en_US">
<meta property="og:image" content="https://cdn.jsdelivr.net/npm/butterfly-extsrc@1/img/default.jpg">
<meta property="article:published_time" content="2021-03-07T12:48:41.000Z">
<meta property="article:modified_time" content="2021-03-08T03:19:47.886Z">
<meta property="article:author" content="Luo Jiehao">
<meta name="twitter:card" content="summary">
<meta name="twitter:image" content="https://cdn.jsdelivr.net/npm/butterfly-extsrc@1/img/default.jpg"><link rel="shortcut icon" href="/interview/img/favicon.png"><link rel="canonical" href="https://luo_13.gitee.io/interview/2021/03/07/CVTE%E5%AE%9E%E4%B9%A0/"><link rel="preconnect" href="//cdn.jsdelivr.net"/><link rel="preconnect" href="//busuanzi.ibruce.info"/><link rel="stylesheet" href="/interview/css/index.css"><link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/@fortawesome/fontawesome-free/css/all.min.css" media="print" onload="this.media='all'"><script>const GLOBAL_CONFIG = { 
  root: '/interview/',
  algolia: undefined,
  localSearch: undefined,
  translate: undefined,
  noticeOutdate: undefined,
  highlight: {"plugin":"highlighjs","highlightCopy":true,"highlightLang":true},
  copy: {
    success: 'Copy successfully',
    error: 'Copy error',
    noSupport: 'The browser does not support'
  },
  relativeDate: {
    homepage: false,
    post: false
  },
  runtime: '',
  date_suffix: {
    just: 'Just',
    min: 'minutes ago',
    hour: 'hours ago',
    day: 'days ago',
    month: 'months ago'
  },
  copyright: undefined,
  lightbox: 'fancybox',
  Snackbar: undefined,
  source: {
    jQuery: 'https://cdn.jsdelivr.net/npm/jquery@latest/dist/jquery.min.js',
    justifiedGallery: {
      js: 'https://cdn.jsdelivr.net/npm/justifiedGallery/dist/js/jquery.justifiedGallery.min.js',
      css: 'https://cdn.jsdelivr.net/npm/justifiedGallery/dist/css/justifiedGallery.min.css'
    },
    fancybox: {
      js: 'https://cdn.jsdelivr.net/npm/@fancyapps/fancybox@latest/dist/jquery.fancybox.min.js',
      css: 'https://cdn.jsdelivr.net/npm/@fancyapps/fancybox@latest/dist/jquery.fancybox.min.css'
    }
  },
  isPhotoFigcaption: false,
  islazyload: false,
  isanchor: false
}</script><script id="config-diff">var GLOBAL_CONFIG_SITE = { 
  isPost: true,
  isHome: false,
  isHighlightShrink: false,
  isToc: true,
  postUpdate: '2021-03-08 11:19:47'
}</script><noscript><style type="text/css">
  #nav {
    opacity: 1
  }
  .justified-gallery img {
    opacity: 1
  }

  #recent-posts time,
  #post-meta time {
    display: inline !important
  }
</style></noscript><script>(win=>{
    win.saveToLocal = {
      set: function setWithExpiry(key, value, ttl) {
        if (ttl === 0) return
        const now = new Date()
        const expiryDay = ttl * 86400000
        const item = {
          value: value,
          expiry: now.getTime() + expiryDay,
        }
        localStorage.setItem(key, JSON.stringify(item))
      },

      get: function getWithExpiry(key) {
        const itemStr = localStorage.getItem(key)

        if (!itemStr) {
          return undefined
        }
        const item = JSON.parse(itemStr)
        const now = new Date()

        if (now.getTime() > item.expiry) {
          localStorage.removeItem(key)
          return undefined
        }
        return item.value
      }
    }
  
    win.getScript = url => new Promise((resolve, reject) => {
      const script = document.createElement('script')
      script.src = url
      script.async = true
      script.onerror = reject
      script.onload = script.onreadystatechange = function() {
        const loadState = this.readyState
        if (loadState && loadState !== 'loaded' && loadState !== 'complete') return
        script.onload = script.onreadystatechange = null
        resolve()
      }
      document.head.appendChild(script)
    })
  
      win.activateDarkMode = function () {
        document.documentElement.setAttribute('data-theme', 'dark')
        if (document.querySelector('meta[name="theme-color"]') !== null) {
          document.querySelector('meta[name="theme-color"]').setAttribute('content', '#0d0d0d')
        }
      }
      win.activateLightMode = function () {
        document.documentElement.setAttribute('data-theme', 'light')
        if (document.querySelector('meta[name="theme-color"]') !== null) {
          document.querySelector('meta[name="theme-color"]').setAttribute('content', '#ffffff')
        }
      }
      const t = saveToLocal.get('theme')
    
          if (t === 'dark') activateDarkMode()
          else if (t === 'light') activateLightMode()
        
      const asideStatus = saveToLocal.get('aside-status')
      if (asideStatus !== undefined) {
        if (asideStatus === 'hide') {
          document.documentElement.classList.add('hide-aside')
        } else {
          document.documentElement.classList.remove('hide-aside')
        }
      }
    })(window)</script><meta name="generator" content="Hexo 5.4.0"></head><body><div id="sidebar"><div id="menu-mask"></div><div id="sidebar-menus"><div class="author-avatar"><img class="avatar-img" src="/interview/null" onerror="onerror=null;src='/img/friend_404.gif'" alt="avatar"/></div><div class="site-data"><div class="data-item is-center"><div class="data-item-link"><a href="/interview/archives/"><div class="headline">Articles</div><div class="length-num">4</div></a></div></div></div><hr/></div></div><div class="post" id="body-wrap"><header class="post-bg" id="page-header" style="background-image: url('https://cdn.jsdelivr.net/npm/butterfly-extsrc@1/img/default.jpg')"><nav id="nav"><span id="blog_name"><a id="site-name" href="/interview/">Interview Blog</a></span><div id="menus"><div id="toggle-menu"><a class="site-page"><i class="fas fa-bars fa-fw"></i></a></div></div></nav><div id="post-info"><h1 class="post-title">CVTE Internship</h1><div id="post-meta"><div class="meta-firstline"><span class="post-meta-date"><i class="far fa-calendar-alt fa-fw post-meta-icon"></i><span class="post-meta-label">Created</span><time class="post-meta-date-created" datetime="2021-03-07T12:48:41.000Z" title="Created 2021-03-07 20:48:41">2021-03-07</time><span class="post-meta-separator">|</span><i class="fas fa-history fa-fw post-meta-icon"></i><span class="post-meta-label">Updated</span><time class="post-meta-date-updated" datetime="2021-03-08T03:19:47.886Z" title="Updated 2021-03-08 11:19:47">2021-03-08</time></span></div><div class="meta-secondline"><span class="post-meta-separator">|</span><span class="post-meta-pv-cv"><i class="far fa-eye fa-fw post-meta-icon"></i><span class="post-meta-label">Post View:</span><span id="busuanzi_value_page_pv"></span></span></div></div></div></header><main class="layout" id="content-inner"><div id="post"><article class="post-content" id="article-container"><h3 id="实习内容"><a href="#实习内容" class="headerlink" title="Internship Work"></a>Internship Work</h3><ol>
<li>Responsible for the text detection component of an ID card detection and recognition system.</li>
<li>Responsible for the text detection component of the class roster detection and recognition system in 班级优化大师 (Class Optimization Master).</li>
</ol>
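<p><img src="/interview/2021/03/07/CVTE%E5%AE%9E%E4%B9%A0/%E7%BB%93%E6%9E%9C.jpg" alt="Results">  </p>
<h3 id="工作介绍"><a href="#工作介绍" class="headerlink" title="Work Overview"></a>Work Overview</h3><p>Both parts of the internship work listed above were eventually consolidated into the same system, so only one of them is described here. Overall, my main responsibility during the internship was detecting printed text in images (photos taken in a variety of scenes, screen captures, and phone screenshots).</p>
<p><img src="/interview/2021/03/07/CVTE%E5%AE%9E%E4%B9%A0/%E6%B5%81%E7%A8%8B.jpg" alt="Pipeline">  </p>
<h3 id="工作难点"><a href="#工作难点" class="headerlink" title="Challenges"></a>Challenges</h3><ol>
<li>Lack of real data</li>
<li>Text is dense, so neighboring text instances tend to merge</li>
<li>Strict real-time requirements</li>
<li>Data annotation is time-consuming</li>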
</ol>
<h3 id="功能实现"><a href="#功能实现" class="headerlink" title="Implementation"></a>Implementation</h3><ul>
<li>Lack of real data</li>
</ul>
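<p>When I took over the task there was very little real data, so the training set combined a small amount of annotated real data with a large amount of synthetic data. The test set consisted of real data only, so that the evaluation results stayed reliable. Because real data was scarce, the whole workflow used only a training set and a test set, with no validation set.</p>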
<ul>
<li>Model tuning</li>
</ul>
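<p>Before going live, the model had to meet preset targets for both accuracy and speed.</p>
<p>Accuracy: an F1 score of at least 0.7 (at IoU = 0.7) on the existing test set.<br>Speed: single-image inference within 100 ms on a 4-core i5 CPU.</p>
<p><strong>Accuracy optimization</strong>: To obtain more information about the text (angle, position, area, and so on) in a relatively simple way, the project used image segmentation rather than object detection. In the text detection tasks above, text is usually dense, and <strong>such dense text tends to merge during segmentation</strong>. To address this, I reimplemented <a target="_blank" rel="noopener" href="https://arxiv.org/pdf/1806.02559.pdf">PSENet</a> in TensorFlow. This method predicts text regions at several scales and, starting from the smallest regions, which are the least likely to merge, expands step by step toward the larger regions with breadth-first search, thereby recovering the complete text regions.</p>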
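<p>As a rough illustration of how an F1 score at a fixed IoU threshold can be computed, here is a minimal sketch. It assumes axis-aligned boxes given as (x1, y1, x2, y2) and greedy one-to-one matching; the function names <code>iou</code> and <code>detection_f1</code> are hypothetical, and the evaluation script actually used in the project (which may have matched rotated boxes or polygons) is not shown in this post.</p>

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def detection_f1(preds, gts, iou_thresh=0.7):
    """Greedy one-to-one matching; returns (precision, recall, f1)."""
    matched = set()  # indices of ground-truth boxes already claimed
    tp = 0
    for p in preds:
        best_j, best_iou = -1, 0.0
        for j, g in enumerate(gts):
            if j in matched:
                continue
            v = iou(p, g)
            if v > best_iou:
                best_j, best_iou = j, v
        # A prediction is a true positive only if its best match clears the threshold.
        if best_j >= 0 and best_iou >= iou_thresh:
            matched.add(best_j)
            tp += 1
    precision = tp / len(preds) if preds else 0.0
    recall = tp / len(gts) if gts else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f1
```

<p>With two ground-truth boxes and two predictions, one of which overlaps a ground-truth box with IoU 0.9 and one of which overlaps nothing, precision, recall and F1 all come out as 0.5.</p>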
<p><img src="/interview/2021/03/07/CVTE%E5%AE%9E%E4%B9%A0/PSENet.jpg" alt="PSENet">   </p>
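<p>The expansion step can be sketched in plain Python as follows. This is a simplified illustration rather than the TensorFlow implementation from the internship: it assumes the network output has already been thresholded into binary masks ordered from smallest to largest, uses 4-connectivity, and resolves conflicts on a first-come-first-served basis. The names <code>label_components</code> and <code>progressive_scale_expansion</code> are hypothetical.</p>

```python
from collections import deque

NEIGHBORS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def label_components(mask):
    """Label 4-connected regions of a binary mask with 1, 2, 3, ..."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    next_label = 0
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not labels[y][x]:
                next_label += 1
                labels[y][x] = next_label
                queue = deque([(y, x)])
                while queue:
                    cy, cx = queue.popleft()
                    for dy, dx in NEIGHBORS:
                        ny, nx = cy + dy, cx + dx
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not labels[ny][nx]:
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
    return labels


def progressive_scale_expansion(kernels):
    """kernels: binary masks ordered from the smallest scale to the largest."""
    # Instances are defined by the smallest kernel, where text rarely touches.
    labels = label_components(kernels[0])
    h, w = len(labels), len(labels[0])
    for kernel in kernels[1:]:
        # Breadth-first expansion from every labeled pixel into the larger kernel.
        queue = deque((y, x) for y in range(h) for x in range(w) if labels[y][x])
        while queue:
            cy, cx = queue.popleft()
            for dy, dx in NEIGHBORS:
                ny, nx = cy + dy, cx + dx
                if 0 <= ny < h and 0 <= nx < w and kernel[ny][nx] and not labels[ny][nx]:
                    labels[ny][nx] = labels[cy][cx]  # first label to arrive wins
                    queue.append((ny, nx))
    return labels
```

<p>For example, with a smallest kernel of <code>[[0, 1, 0, 0, 0, 1, 0]]</code> and a largest kernel of <code>[[1, 1, 1, 1, 1, 1, 1]]</code>, the largest mask alone is a single merged region, but the expansion still returns two instances.</p>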
<p>In addition, to enlarge the receptive field, the FPA (Feature Pyramid Attention) structure from <a target="_blank" rel="noopener" href="https://arxiv.org/pdf/1805.10180.pdf">PAN</a> was adopted.</p>
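<p><img src="/interview/2021/03/07/CVTE%E5%AE%9E%E4%B9%A0/FPA.jpg" alt="FPA">   </p>
<p>Two attention mechanisms were also introduced to further improve accuracy: the GAU (Global Attention Upsample) structure from <a target="_blank" rel="noopener" href="https://arxiv.org/pdf/1805.10180.pdf">PAN</a> and the squeeze-and-excitation channel attention from <a target="_blank" rel="noopener" href="https://arxiv.org/pdf/1709.01507.pdf">SENet</a>. GAU was used to fuse the levels of the feature pyramid. The SENet block was used only in the decoder of the segmentation network, with its FC layers replaced by 1*1 convolutions to reduce computation.</p>
<p><strong>Speed optimization</strong>: The total latency of the pipeline has two parts, <strong>model inference time and post-processing time</strong>.<br>Inference time was mainly optimized through the model structure. I tried separable convolutions and <a target="_blank" rel="noopener" href="https://openaccess.thecvf.com/content_ECCV_2018/papers/Changqian_Yu_BiSeNet_Bilateral_Segmentation_ECCV_2018_paper.pdf">BiSeNet</a>, among others, but the latency still could not be brought within the target. In the end, detection time was reduced by aggressively cutting the number of layers and channels. Concretely, before training I used the TF profiling tool to measure how long each part of the model took, trimmed the slow parts accordingly, and trained the network only once the model met the latency target.</p>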
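<p>To make the SENet-style block concrete, here is a minimal pure-Python sketch of squeeze, excitation and scaling. It is an illustration only, not the TensorFlow layer used in the project: the function name <code>se_block</code> and the weight layout are assumptions, and biases are omitted. After global average pooling the feature map is 1*1 per channel, so a 1*1 convolution on it reduces to a matrix-vector product, which is what the code computes.</p>

```python
import math


def se_block(feature, w1, w2):
    """feature: [C][H][W] nested lists; w1: [C//r][C]; w2: [C][C//r] (biases omitted)."""
    # Squeeze: global average pooling gives one value per channel.
    squeezed = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature]
    # Excitation, step 1: 1*1 conv on a 1*1 map is a matrix-vector product, then ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    # Excitation, step 2: second 1*1 conv, then sigmoid to get per-channel gates in (0, 1).
    gates = [1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, hidden)))) for row in w2]
    # Scale: reweight every channel of the input by its gate.
    return [[[v * g for v in row] for row in ch] for ch, g in zip(feature, gates)]
```

<p>With all-zero weights every gate is sigmoid(0) = 0.5, so each channel is simply halved, which is a convenient sanity check for the wiring.</p>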
<p><img src="/interview/2021/03/07/CVTE%E5%AE%9E%E4%B9%A0/profile_2.png" alt="profile">    </p>
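<p>The idea of measuring each part before trimming can be sketched without any framework. The helper below is a generic stand-in, not the TF profiling tool mentioned above: it assumes the model can be split into a list of callables, and the name <code>time_stages</code> and the stage names in the usage note are hypothetical.</p>

```python
import time


def time_stages(stages, x, repeats=10):
    """stages: list of (name, fn). Returns average seconds per stage over `repeats` runs."""
    totals = {name: 0.0 for name, _ in stages}
    for _ in range(repeats):
        value = x
        for name, fn in stages:
            start = time.perf_counter()
            value = fn(value)  # the output of one stage feeds the next
            totals[name] += time.perf_counter() - start
    return {name: total / repeats for name, total in totals.items()}
```

<p>Calling it with something like <code>[("backbone", backbone_fn), ("decoder", decoder_fn), ("postprocess", post_fn)]</code> gives a per-stage breakdown, and the stage with the largest average is the first candidate for trimming.</p>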
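<p>The main job of post-processing is to find the text regions in the binarized image, filter out regions that fail rule-based checks, and finally crop the rectified text regions and pass them on to the next step, the text recognition module.</p>
<p>In roster name detection the number of text instances is usually large, often more than 100. Post-processing involves many loops and conditional checks, so doing it in Python is slow (testing also showed that OpenCV's Python interface was slower than its C++ interface). To cut post-processing time, this part was written as a C++ library with pybind11, and the resulting C++ shared library was called from Python. I recall the speedup being considerable, but I no longer remember the exact figure.</p>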
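<p>To give a sense of the kind of per-pixel, per-instance loop involved, here is a small pure-Python sketch of one post-processing step: extracting connected regions from a binary mask and dropping those below a minimum area. The function name <code>extract_text_boxes</code> and the area rule are hypothetical, the actual filtering rules are not listed in this post, and rectification of rotated regions is left out.</p>

```python
from collections import deque


def extract_text_boxes(binary, min_area=4):
    """Return (x1, y1, x2, y2) boxes of 4-connected regions with at least min_area pixels."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not binary[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill over one region, tracking area and extent.
            seen[y][x] = True
            queue = deque([(y, x)])
            area, x1, y1, x2, y2 = 0, x, y, x, y
            while queue:
                cy, cx = queue.popleft()
                area += 1
                x1, y1 = min(x1, cx), min(y1, cy)
                x2, y2 = max(x2, cx), max(y2, cy)
                for ny, nx in ((cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)):
                    if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if area >= min_area:  # hypothetical filtering rule
                boxes.append((x1, y1, x2, y2))
    return boxes
```

<p>Every pixel is visited in interpreted Python here, which is the sort of work that benefits from being moved into compiled code.</p>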
<ul>
<li>Model deployment &amp; semi-automatic annotation</li>
</ul>
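<p>After deployment the work becomes an iterative loop of collecting data and updating the model. Most of the collected data is added to the test set, which keeps the offline results reliable. During the internship the project had no plan to outsource annotation, so I had to annotate the data myself. To reduce the workload, I modified <a target="_blank" rel="noopener" href="https://github.com/tzutalin/labelImg">labelimg</a> into a semi-automatic annotation tool: after the data is imported, the tool can run a saved inference model and store the predictions in annotation format, so finishing the annotation only requires fine-tuning those predictions. This saved a lot of time.</p>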
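<p>The step of saving predictions in annotation format might look roughly like the sketch below. It writes axis-aligned boxes as Pascal VOC XML, a format that labelImg can read and write. Which format the modified tool actually used is not stated in this post, and the function name <code>predictions_to_voc_xml</code> and the default label <code>"text"</code> are assumptions for illustration.</p>

```python
import xml.etree.ElementTree as ET


def predictions_to_voc_xml(filename, width, height, boxes, label="text"):
    """boxes: iterable of (xmin, ymin, xmax, ymax). Returns a Pascal VOC XML string."""
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename
    size = ET.SubElement(root, "size")
    for tag, value in (("width", width), ("height", height), ("depth", 3)):
        ET.SubElement(size, tag).text = str(value)
    for box in boxes:
        # One <object> per predicted text region, to be corrected by hand later.
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = label
        ET.SubElement(obj, "difficult").text = "0"
        bndbox = ET.SubElement(obj, "bndbox")
        for tag, value in zip(("xmin", "ymin", "xmax", "ymax"), box):
            ET.SubElement(bndbox, tag).text = str(int(value))
    return ET.tostring(root, encoding="unicode")
```

<p>Writing the returned string to an <code>.xml</code> file next to the image lets the annotator open the image with the predicted boxes already drawn and adjust only the ones that are wrong.</p>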
</article><div class="post-copyright"><div class="post-copyright__author"><span class="post-copyright-meta">Author: </span><span class="post-copyright-info"><a href="mailto:undefined">Luo Jiehao</a></span></div><div class="post-copyright__type"><span class="post-copyright-meta">Link: </span><span class="post-copyright-info"><a href="https://luo_13.gitee.io/interview/2021/03/07/CVTE%E5%AE%9E%E4%B9%A0/">https://luo_13.gitee.io/interview/2021/03/07/CVTE%E5%AE%9E%E4%B9%A0/</a></span></div><div class="post-copyright__notice"><span class="post-copyright-meta">Copyright Notice: </span><span class="post-copyright-info">All articles in this blog are licensed under <a target="_blank" rel="noopener" href="https://creativecommons.org/licenses/by-nc-sa/4.0/">CC BY-NC-SA 4.0</a> unless stating additionally.</span></div></div><div class="tag_share"><div class="post-meta__tag-list"></div><div class="post_share"><div class="social-share" data-image="https://cdn.jsdelivr.net/npm/butterfly-extsrc@1/img/default.jpg" data-sites="facebook,twitter,wechat,weibo,qq"></div><link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/social-share.js/dist/css/share.min.css" media="print" onload="this.media='all'"><script src="https://cdn.jsdelivr.net/npm/social-share.js/dist/js/social-share.min.js" defer></script></div></div><nav class="pagination-post" id="pagination"><div class="prev-post pull-left"><a href="/interview/2021/03/07/%E6%B3%B0%E7%A7%91%E7%94%B5%E5%AD%90AICUP%E6%AF%94%E8%B5%9B/"><img class="prev-cover" src="https://cdn.jsdelivr.net/npm/butterfly-extsrc@1/img/default.jpg" onerror="onerror=null;src='/interview/img/404.jpg'" alt="cover of previous post"><div class="pagination-info"><div class="label">Previous Post</div><div class="prev_info">泰科电子AI CUP比赛</div></div></a></div><div class="next-post pull-right"><a href="/interview/2021/03/07/RoboMaster/"><img class="next-cover" src="https://cdn.jsdelivr.net/npm/butterfly-extsrc@1/img/default.jpg" 
onerror="onerror=null;src='/interview/img/404.jpg'" alt="cover of next post"><div class="pagination-info"><div class="label">Next Post</div><div class="next_info">Robomaster</div></div></a></div></nav></div><div class="aside-content" id="aside-content"><div class="card-widget card-info"><div class="card-info-avatar is-center"><img class="avatar-img" src="/interview/null" onerror="this.onerror=null;this.src='/interview/img/friend_404.gif'" alt="avatar"/><div class="author-info__name">Luo Jiehao</div><div class="author-info__description"></div></div><div class="card-info-data"><div class="card-info-data-item is-center"><a href="/interview/archives/"><div class="headline">Articles</div><div class="length-num">4</div></a></div></div><a class="button--animated" id="card-info-btn" target="_blank" rel="noopener" href="https://github.com/xxxxxx"><i class="fab fa-github"></i><span>Follow Me</span></a></div><div class="card-widget card-announcement"><div class="item-headline"><i class="fas fa-bullhorn card-announcement-animation"></i><span>Announcement</span></div><div class="announcement_content">This is my Blog</div></div><div class="sticky_layout"><div class="card-widget" id="card-toc"><div class="item-headline"><i class="fas fa-stream"></i><span>Catalog</span></div><div class="toc-content"><ol class="toc"><li class="toc-item toc-level-3"><a class="toc-link" href="#%E5%AE%9E%E4%B9%A0%E5%86%85%E5%AE%B9"><span class="toc-number">1.</span> <span class="toc-text">实习内容</span></a></li><li class="toc-item toc-level-3"><a class="toc-link" href="#%E5%B7%A5%E4%BD%9C%E4%BB%8B%E7%BB%8D"><span class="toc-number">2.</span> <span class="toc-text">工作介绍</span></a></li><li class="toc-item toc-level-3"><a class="toc-link" href="#%E5%B7%A5%E4%BD%9C%E9%9A%BE%E7%82%B9"><span class="toc-number">3.</span> <span class="toc-text">工作难点</span></a></li><li class="toc-item toc-level-3"><a class="toc-link" href="#%E5%8A%9F%E8%83%BD%E5%AE%9E%E7%8E%B0"><span class="toc-number">4.</span> <span 
class="toc-text">功能实现</span></a></li></ol></div></div><div class="card-widget card-recent-post"><div class="item-headline"><i class="fas fa-history"></i><span>Recent Post</span></div><div class="aside-list"><div class="aside-list-item"><a class="thumbnail" href="/interview/2021/03/07/%E7%99%BD%E4%BA%91%E6%9C%BA%E5%9C%BA%E9%81%93%E8%B7%AF%E6%83%85%E5%86%B5%E7%9B%91%E6%8E%A7/" title="白云机场道路情况监控"><img src="https://cdn.jsdelivr.net/npm/butterfly-extsrc@1/img/default.jpg" onerror="this.onerror=null;this.src='/interview/img/404.jpg'" alt="白云机场道路情况监控"/></a><div class="content"><a class="title" href="/interview/2021/03/07/%E7%99%BD%E4%BA%91%E6%9C%BA%E5%9C%BA%E9%81%93%E8%B7%AF%E6%83%85%E5%86%B5%E7%9B%91%E6%8E%A7/" title="白云机场道路情况监控">白云机场道路情况监控</a><time datetime="2021-03-07T12:48:43.000Z" title="Created 2021-03-07 20:48:43">2021-03-07</time></div></div><div class="aside-list-item"><a class="thumbnail" href="/interview/2021/03/07/%E6%B3%B0%E7%A7%91%E7%94%B5%E5%AD%90AICUP%E6%AF%94%E8%B5%9B/" title="泰科电子AI CUP比赛"><img src="https://cdn.jsdelivr.net/npm/butterfly-extsrc@1/img/default.jpg" onerror="this.onerror=null;this.src='/interview/img/404.jpg'" alt="泰科电子AI CUP比赛"/></a><div class="content"><a class="title" href="/interview/2021/03/07/%E6%B3%B0%E7%A7%91%E7%94%B5%E5%AD%90AICUP%E6%AF%94%E8%B5%9B/" title="泰科电子AI CUP比赛">泰科电子AI CUP比赛</a><time datetime="2021-03-07T12:48:42.000Z" title="Created 2021-03-07 20:48:42">2021-03-07</time></div></div><div class="aside-list-item"><a class="thumbnail" href="/interview/2021/03/07/CVTE%E5%AE%9E%E4%B9%A0/" title="CVTE实习"><img src="https://cdn.jsdelivr.net/npm/butterfly-extsrc@1/img/default.jpg" onerror="this.onerror=null;this.src='/interview/img/404.jpg'" alt="CVTE实习"/></a><div class="content"><a class="title" href="/interview/2021/03/07/CVTE%E5%AE%9E%E4%B9%A0/" title="CVTE实习">CVTE实习</a><time datetime="2021-03-07T12:48:41.000Z" title="Created 2021-03-07 20:48:41">2021-03-07</time></div></div><div class="aside-list-item"><a class="thumbnail" 
href="/interview/2021/03/07/RoboMaster/" title="Robomaster"><img src="https://cdn.jsdelivr.net/npm/butterfly-extsrc@1/img/default.jpg" onerror="this.onerror=null;this.src='/interview/img/404.jpg'" alt="Robomaster"/></a><div class="content"><a class="title" href="/interview/2021/03/07/RoboMaster/" title="Robomaster">Robomaster</a><time datetime="2021-03-07T12:48:40.000Z" title="Created 2021-03-07 20:48:40">2021-03-07</time></div></div></div></div></div></div></main><footer id="footer"><div id="footer-wrap"><div class="copyright">&copy;2020 - 2021 By Luo Jiehao</div><div class="framework-info"><span>Framework </span><a target="_blank" rel="noopener" href="https://hexo.io">Hexo</a><span class="footer-separator">|</span><span>Theme </span><a target="_blank" rel="noopener" href="https://github.com/jerryc127/hexo-theme-butterfly">Butterfly</a></div></div></footer></div><div id="rightside"><div id="rightside-config-hide"><button id="readmode" type="button" title="Read Mode"><i class="fas fa-book-open"></i></button><button id="darkmode" type="button" title="Switch Between Light And Dark Mode"><i class="fas fa-adjust"></i></button><button id="hide-aside-btn" type="button" title="Toggle between single-column and double-column"><i class="fas fa-arrows-alt-h"></i></button></div><div id="rightside-config-show"><button id="rightside_config" type="button" title="Setting"><i class="fas fa-cog fa-spin"></i></button><button class="close" id="mobile-toc-button" type="button" title="Table Of Contents"><i class="fas fa-list-ul"></i></button><button id="go-up" type="button" title="Back To Top"><i class="fas fa-arrow-up"></i></button></div></div><div><script src="/interview/js/utils.js"></script><script src="/interview/js/main.js"></script><div class="js-pjax"></div><script async data-pjax src="//busuanzi.ibruce.info/busuanzi/2.3/busuanzi.pure.mini.js"></script></div></body></html>